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Abstract 

This paper studies two spectrum estimation methods for the case that the samples are obtained at a 
rate lower than the Nyquist rate. The first method is the correlogram method for undersampled data. The 
algorithm partitions the spectrum into a number of segments and estimates the average power within each 
spectral segment. We derive the bias and the variance of the spectrum estimator, and show that there is a 
tradeoff between the accuracy of the estimation and the frequency resolution. The asymptotic behavior of 
the estimator is also investigated, and it is proved that this spectrum estimator is consistent. 

A new algorithm for reconstructing signals with sparse spectrum from noisy compressive measurements 
is also introduced. Such model-based algorithm takes the signal structure into account for estimating the 
unknown parameters which are the frequencies and the amplitudes of linearly combined sinusoidal signals. 
A high-resolution spectral estimation method is used to recover the frequencies of the signal elements, while 
the amplitudes of the signal components are estimated by minimizing the squared norm of the compressed 
estimation error using the least squares technique. The Cramer-Rao bound for the given system model is 
also derived. It is shown that the proposed algorithm approaches the bound at high signal to noise ratios. 
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I. Introduction 

Spectrum estimation from a finite set of noisy measurements is a classical problem with wide 
applications in communications, astronomy, seismology, radar, sonar signal processing, etc. 
Existing spectrum estimation techniques can be categorized as low-resolution and high-resolution 
methods. Low-resolution techniques such as the periodogram and correlogram methods are based on 
estimating the autocorrelation function of a signal. High-resolution techniques such as the multiple 
signal classification (MUSIC) method |l3l and the estimation of signal parameters via rotational 
invariance techniques (ESPRIT) [4] are based on modeling and parameterizing a signal. 

In practice, the rate at which the measurements are collected can be restricted. Therefore, it is 
desirable to make spectrum estimation from measurements obtained at a rate lower than the Nyquist 
rate. In dSj and |l6l, authors have shown that for signals with sparse Fourier representations, the 
Fourier coefficients can be estimated using a subset of the Nyquist samples (samples obtained at 
the Nyquist rate). The existing low- and high-resolution spectrum estimation techniques can be 
generalized for the case that the measurements are obtained at a rate lower than the Nyquist rate 
0, (HI- Similar to the conventional spectrum estimation techniques, the low- and high-resolution 
methods working on undersampled data use the autocorrelation function and the model of the signal, 
respectively. 

In [|71, authors have considered power spectral density (PSD) estimation based on the autocorre- 
lation matrices of the data. We refer to this method as the correlogram for undersampled data, as it 
is able to reconstruct the spectrum from a subset of the Nyquist samples. In this method, samples 
are collected using multiple channels, each operating at a rate L times lower than the Nyquist 
rate. The algorithm partitions the spectrum into L segments, and it estimates the average power 
within each segment. We will later show that increasing the value of L reduces the quality of the 
estimation. Therefore, the parameter L cannot be chosen arbitrarily large, which indicates that this 
method lies in the category of low -resolution spectrum estimation techniques. 

High-resolution spectrum estimation techniques for undersampled data can be obtained by consid- 
ering the signal model. In [8J, two model-based methods have been introduced for recovering sparse 
signals from compressive measurements. These measurements are obtained by correlating the signal 
with a set of sensing waveforms. This is the basic sampling technique in compressive sensing (CS) 
im, ifTOl . where signals with sparse representations are recovered from a number of measurements 
that is much less than the number of the Nyquist samples. In [8], authors consider signals composed 
of linear combinations of sinusoids. This type of signals appear frequently in signal processing and 
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digital communications Q, IfTTI . Albeit these signals generate sparse coefficients by the discrete- 
time Fourier transform (DTFT), their representation in the Fourier basis obtained by the discrete 
Fourier transform (DFT) exhibits frequency leakage. This problem results in the poor performance 
of the conventional CS recovery algorithms that rely on the Fourier basis (see (811). Although these 
signals do not have a sparse representation in the Fourier basis, they possess a sparse model in terms 
of the DTFT. In [|12|. the advantages of taking a signal model into account for signal reconstruction 
have been demonstrated and the name model-based CS has been coined. In |[8l, the model-based 
CS method has been modified for spectral estimation. According to the model-based method, the 
signal is reconstructed in an iterative manner, where at each iteration, a signal estimate is formed 
and pruned according to the model. 

The contributions of this paper are presented in two parts. In the first part, the correlogram 
for undersampled data is analyzed, and in the second part, an improved model-based spectrum 
estimation algorithm for spectral compressive sensing is introduced. We have reported a summary 
of the results in f.l3J and lfT4l . Here, we provide in-depth derivations and present new simulation 
results. 

First, we study the correlogram for undersampled data. We compute the bias of the estimator 
and show that the estimation is unbiased for any signal length. Moreover, the covariance matrix of 
the estimator is derived, and it is proved that the estimation variance tends to zero asymptotically. 
Therefore, the correlogram for undersampled data is a consistent estimator. Using our derivations, 
we show that for finite-length signals, there exists a tradeoff between the estimation accuracy and 
the frequency resolution of the estimator. Specifically, higher resolution reduces the accuracy of 
the estimation. 

In the second part of the paper, we introduce a new CS recovery method. The important difference 
of our method from that of |[8J is the approach used for estimating the amplitudes of the signal 
elements. In |[8l, the unknown amplitudes are estimated using the DTFT, while we estimate the 
amplitudes by minimizing the squared norm of the compressed estimation error. Furthermore, we 
analyze the proposed method, derive the Cramer-Rao bound (CRB) for spectral compressive sensing, 
and show that the proposed algorithm approaches the CRB. 

The rest of the paper is organized as follows. The correlogram for undersampled data is reviewed 
and revised in Section |IIl In Section [nil the bias and the variance of the correlogram for under- 
sampled data estimator are derived. The model-based nested least squares method is introduced 
in Section |IVl and the Cramer-Rao bound for spectral compressive sensing is derived in Section 
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rvl Section |VI] presents some numerical examples on the estimation variance of the correlogram 
method for finite-length signals as well as simulation results for the model-based nested least squares 
algorithm. Finally, Section IVIII concludes the paper. This paper is reproducible research [fTSl and 
the software needed to generate the simulation results will be provided to the IEEE Xplore together 
with the paper upon its acceptance. 

II. Correlogram for Undersampled Data 

Consider a wide-sense stationary (WSS) stochastic process x{t) bandlimited to W/2 with power 
spectral density (PSD) Px{uj). Let x{t) be sampled using the multi-coset (MC) sampler as described 
in Q. Samples are collected by a multi-channel system. The i-th channel (I < i < q) samples 
x{t) at the time instants t = [nL + q)T for n = 0, 1, . . . , — 1, where is the number of 
samples obtained from each channel, T is the Nyquist period (T = l/W), L is a suitable integer, 
and g < L is the number of sampling channels. The time offsets Q (1 < i < g) are distinct 
random positive integer numbers less than L. Let the output of the i-th channel be denoted by 
yi{n) = X {{nL + Ci)T). The i-th channel can be easily implemented by a system that shifts x{t) 
by CjT seconds and then samples uniformly at a rate of \/LT Hz. The samples obtained in this 
manner form a subset of the Nyquist samples. The average sampling rate is q/LT Hz, and it is 
less than the Nyquist rate since q < L. 

Given the MC samples, the PSD of the signal can be estimated by transforming the output 
sequences yi{n) = x {{nL + q)T) into a system of frequency domain equations. Let Yi (^e^'^'^^ and 
X{u) denote the Fourier transform of yi{n) and x{t), respectively. Then, the following relationship 
holds Q 

z{io) = Ts{io). (1) 

Here F G C^^^ and its (i, /)-th element is given as [T]^ ^ = ^e"-'^'^'"^' where L is an odd number 
and nil = —^{L + 1) + /. The vector z{uj) = [zi{u), Z2{ui), . . . , Zq{u)]'^ E contains the ele- 
ments Zi{uj) = e'^w'^Yi (^e^'^^^ I[_iiw_ and the vector s{uj) = [si{u), S2{uj), . . . , sl{uj)]'^ G 
C^^^ contains the elements si{ui) = X — 27r^m/) I^_7rw tlK^ where (■)^ stands for the trans- 
position operator and /[.) represents the indicator function. 

Let Rz G C^^"^ and Rg G C^^^ be the autocorrelation matrices of z{co) and s{uj), respectively. 
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Then, it can be found that 



= Urn \- [ ^ E{z{uj)z^{Lo)}duj 

N^oo N J 

w 

lim i- / " E{s{uj)s''{uj)}du 

N-^OO iV / W 
t/ —7V—T— 



N-^oo N 

= ri?,r^ (2) 

where (■)^ and E{-} stand for the Hermitian transposition and the expectation operators, respec- 
tively. 

Consider partitioning the bandwidth of x{t) into L equal segments. It is shown in [7j that the 
diagonal elements of Rg represent the average power within such spectral segments, and the off- 
diagonal elements are zeros. Thus, the (a, 6)-th element of R^ in ^ can be rewritten as 

L 

1=1 

2 L 



^ ^ 1=1 



The l-th diagonal element of Rs, i.e., [Rs]i,i, corresponds to the average power within the spectral 
segment [ttW — 2tt^I, hW — 27r^(/ — 1)). Note that Rz is a Hermitian matrix with equal diag- 
onal elements. Then, it is sufficient to let the indices a and b just refer to the elements of the upper 
triangle and the first diagonal element of R^. Therefore, there are Q = q(q — l)/2 + 1 equations 
of type ^ {1 < k < Q). In matrix- vector form, ^ can be rewritten as 

u = (4) 

where v = [vi,V2, ■ ■ ■ , vlI'^eM.^^^ consists of the diagonal elements of Rg, u=[ui,U2, ■ ■ ■ , uq]^ G 
([^Qxi ^[^Yi ui = [Rz]i^i and U2, . . . , uq corresponding to the elements of the upper triangle of R^, 
and ^ E C*^^^ with elements given by 

[*]^,; = {W/iye-^^^"'^-'''^"'^. (5) 

Since the elements of v are real- valued, the number of equations in ([H) can be doubled by solving 
u = ^v, where u = [Re{u), Im{u)f e M^Qxi and ^ = [i?e(*), /m(*)]^ G M^Qxl^ 

Suppose ^ is full rank and 2Q > L. Then, v can be determined using the pseudoinverse of ^ 

as 

V = (6) 
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The elements of ii are comprised of the elements of Rz- 

The autocorrelation matrix Rz is not known and has to be estimated. The estimation of Rz can 
be found from a finite number of samples as 

[R.U = 2vr^ Y.y^{n- ^) yl (n - |) (7) 

n=0 

where (■)* denotes the conjugate of a complex number. 

The fractional delays Ca/ L and Cb/L can be implemented by fractional delay filters such as the 
Lagrange interpolator which is a finite impulse response (FIR) filter [[161 • FIR fractional delay filters 
perform the best when the total delay is approximately equal to half of the order of the filter [[TTII . 
The fractional delays c^/L and Cb/L are positive numbers less than one, and the performance of 
the FIR fractional delay filters is very poor with such delays. To remedy this problem, a suitable 
integer delay can be added to the fractional part. Referring to the definition of Rz in ^ and noting 
that 

z{oo)z''{uj) = (^2(u;)e-^'^"^) (e-^'^"^z(u;))'' (8) 

we can rewrite © as 

[Rz]a,b = 



n=0 

where D is a suitable integer number. 

Let ha{n) be the impulse response of a causal filter that delays a signal for Ca/L + D steps. The 
output of the filter can be written as 

y'ain) ^ yain - (| + D)) 

n 

= ^ ya{m)ha{n - m) (10) 

m,=max(0,n— A''h+l) 

where is the length of the filter's impulse response. Using the elements of Rz, the vector u is 
formed as an estimation for u. Next, v (the estimation for v) is formed by replacing u with u in 
© as 

V = (^^^ly^^^^i. (11) 

The elements of v represent an estimation for the average power within each spectral segment. 
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III. Bias and Variance of Correlogram for Undersampled Data 
A. Bias Computation 

Let x{t) be a zero-mean white Gaussian random process with PSD Px{u}) = a^O The estimation 
bias can be found by computing the expected value of i). From ([TT]) we have 

E{v} = {4f^4f)-^4f'^E{ii}. (12) 

In order to determine E{u}, it is required to find the expected value of the real and imaginary 
parts of Rz- The expectation operation can be performed before taking the real or imaginary parts 
of itz, as these operators are linear. Moreover, dH) is used to form Rz- Taking expectation from 
both sides of Q along with using (flOl) results in 

N-l 

NL 

N-l 

NL 



E{[RzU} = 27r|^^E{yf(n)yf H)} 

n=0 

-I^E E E 



n=0 r=max p=max 

(O.n-Affe+l) (Cn-ATft+l) 

ha{n - r)hb{n - p)E{ya{r)yl{p)}. (13) 

The problem is now reduced to finding E{ya{r)yl(p)}, which is obtained as 

E{ya{r)y;{p)} = E{x{{rL + Ca)T)x\{pL + c,)T)] 

= (14) 

for rL + Ca = pL + Ch (or a = b, r = p), and it equals zero otherwise. This results from the fact 
that x{t) is a white process with PSD Pxiy}) = cr^. Applying (fT4l) to (fT3l) . we find that 

E{[RzU} = ^ (15) 

for a y^h, and 

E{[RzU} = ^^j^Yl E 

n=0 r=max 

{0,n-Nh+l) 



w 



= 27^— Haa^ (16) 

'a general signal can be written as a filtered Gaussian process, which is a standard approach for traditional correlogram and 
periodogram analysis as well 1181 . 
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for a = b, where 

, N-l n Nh-l 

^^-nT. E ^'(^-^) = iv S^^-'"^^'^'")- ^^^^ 

n=0 r=max m=0 
(0,n-Afh+l) 

Recalling that the first diagonal element of is used in u and taking the real and imaginary parts 
of (fTSi) and (fT6l) . -E'{'u} can be obtained as 

= 271—Hia^ei (18) 

where ei is a column vector of length g(g — 1) + 2 with all its elements equal to zero except for 
the first element which is 1. The expected value of v can be found using (fT2l) and (fTSi) as 

E{v} = 27T—Hi(r^{^ *)-^* ei. (19) 

We analyze next the asymptotic behavior of the correlogram for undersampled data. First, note 
that Rz is an asymptotically unbiased estimator of R^. To show this, it is enough to take expectation 
of both sides of (|7]) while letting the number of samples tend to infinity. This directly leads to R^ 
without requiring any more computation. Since u consists of the elements of Rz and the operation 
of taking the real and imaginary parts are linear, it follows that ii is also an asymptotically unbiased 
estimator of ii. Furthermore, letting the number of samples tend to infinity in (fT2l) and using 
we find that 

lim E{v} = (#^*)~^*^ lim E{iL} 

= (*^*)-^*^w = V. (20) 

In other words, i) is also an asymptotically unbiased estimator of v. Consider the fact that x{t) has 
equal power in all spectral segments (the elements of v are all the same). Since v is asymptotically 
unbiased, it follows that the elements of hraj^^oo E{v} are also equal. 

Replacing the true values in Q with the estimated values for a = b = 1, taking expectation from 
both sides, and letting the number of samples tend to infinity, we obtain that 

2 L 



lim E{[Rzh,^}= (^) lim E{vi} 

^ ^ 1=1 



— ll lim E{v} (21) 



where vi (\ < I < E) are the elements of v, and 1l is the column vector of length L with all 
its elements equal to 1. Considering normalized fractional delay filters (X^mLo^ ^a("^) ^ ^'^'^ 
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referring to (fTTl ). we also find that 



lim Ha = l. (22) 



Tlierefore, using (fT6l) . we can find tliat 

W 



lim E{[R,],^,} = 27r^a^. (23) 
Combining (|2T1) with (l23l) results in 

lim = — fr'lL. (24) 

Letting the number of samples tend to infinity in (fT9l ) and using (|24l) . we obtain 

lim ^{v} = 27r— *)-^* ei = — a^U. (25) 

N^oo L W 

It follows form (|25l) that all the elements of the first column of ^)~^^ are equal to L/W'^. 
Therefore, (fT9] ) can be simplified as 

E{v} = '^H.aHL. (26) 



Finally, let us define p as 



Note that E{p} = {W/2'n:Hi)E{v} = a^li- Therefore, the l-th element of p (1 < / < L) gives 
an unbiased estimation of the average power in the l-th spectral segment. 



B. Variance Computation 

Theorem 1: The correlogram estimation based on undersampled data is a consistent estimator of 
the average power in each spectral segment. 

Proof: The covariance matrix of the correlogram for undersampled data estimator is given by 

Cp = E{{p- p){p - p)^} = E{pp'^} - pp^ (28) 

where p is a vector of length L consisting of the average power in each spectral segment. For the 
Gaussian signal case, all the elements of p are equal to cr^. It follows from ([TT)) and (l27l) that 

__j u^i^f (29) 

where U = E{iiii'^} e M2Qx2Q_ 
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Computation of the elements of U involves taking expectation of the multiplication of the real 
or imaginary parts of the elements of Rz- We will use the following lemma [|T9l for interchanging 
the expectation and the operation of taking real or imaginary parts. 

Lemma 1. Let x and y be two arbitrary complex numbers. The following equations hold 

Re{x)Re{y) = ^ iRe{xy) + Re{xy*)) (30) 
Ini{x)Ini{y) = —-{Re{xy) — Re{xy*)) (31) 
Re{x)Im{y) = - {Im{xy) — Im^xy*)) . (32) 

The elements of U can be easily obtained using E{[Rz]a,b[Rz]c,d}, E{[Rz]a,b[Rz]*c d}' 
Lemma 1, where [Rz]a,b and [Rz]c,d are the elements of Rz used for forming ii. Using Q and 
(flOl) . we obtain 



E{[Rz]a,b[Rz]c,d} 

E E E{y'Mytin)yt{u)yt{u)} 



n=0 u=0 

2 



^ n u r p s m 

ha{n - r)hi,{n - p)hc{u - s)hd{u - m) 

xE{y,{r)yl{p)y,{s)y*,{m)} (33) 
where and Y.m are notations for Y.n=o^ Eu=o ' =max(0,n— Af/^+l) ' 

p=max(0,n-Afh+l)' 2^s=max{0,u-Nh+l)' I^m=m3.x{0,u-Nh+1)' respectively. 

The expectation operation in (l33l) can be obtained using the forth moment of x(t) as 

El = E{ya{r)y;{p)yc{s)y*a{m)} 

= E{x{{rL + c,)T)x*{{pL + ct)T) 

X x{{sL + Cc)T)x*{{mL + q)T)} 
= cr'^(5(r — p)6{a — b)6{s — m)6{c — d) 

+ 5{r - m)5{a - d)6{p - s)6{b - c)) (34) 
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where 6{-) is the Kronecker deka. In a similar way, E{[Rz]a,b[Rz]*c^d} '■^^^ be obtained as 

E{[Rz]a,b[^z]c,d} 

^ n u r p s m 

hain — r)h}j{n — p)hc{u — s)hd{u — m)E2 (35) 

where E2 is defined as 

E2 = E{ya{r)y;{p)y*{s)yd{m)} 

= cr^(5(r — p)5{a — b)5{s — m)6{c — d) 

+ 5{r - s)5{a-c)5{j)-m)5{b-d)). (36) 
Recalling that only the first diagonal element of Rz is present in u, Ei can be found to be equal 

to 

El = a^(^5{r — p)S{s — m) + S{r — m)5{p — s)) (37) 
for: a = h = c = d = 1, and it equals to zero otherwise. Similarly, E2 can be found to be equal to 

E2 = a^S{r - s)5{p - m) (38) 

for a=c and b=d, and it equals zero otherwise. Noting that E{[Rz]a,b[Rz]c,d} and E{[Rz]a,b[Rz]cd} 
are real-valued and using (|32l) . (l37l) . and (|38l) . we can find that all the off-diagonal elements of U 
are equal to zero. 

Let us start computing the diagonal elements of U by setting a = b = c = d = l.lt follows 
from (133]), dMl), and dlT]) that 

^ ^ n u r s 

hlin-r)hl{u- s) + J2Siin)'^ (39) 

n 

where Si{n) is defined as 

-^ii^) = E E E E E ^(^ - - ^) 

u r p s m 

xhi{n — r)hi{n — p)hi{u — s)hi{u — m). (40) 



It is straightforward to show that for A^/^ — 1 < n < — A^h, Si{n) is given by 

Gi ^ Si{n) = E [hi{i) * hi{Nn - 1 - i)\gf (41) 
9=0 
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where * denotes the convolution operation. Note that Gi is not a function of n. For < n < Nh — 1, 
Si{n) is given by 

n+Nh-l 

Si{n)= Yl [ihi{t)Wn{t))*h{Nf,-l-t)\gf (42) 

9=0 

where Wn{i) is equal to 1 for < i < n and zero elsewhere. For N — < n < N — 1, Si{n) is 
given by 

N-n+Nf,-2 

Siin)= Yl M^)*h{N,,-l-^)\g]\ (43) 

9=0 

Next, (|39l ) can be rewritten as 



NL 



n r us 

+ {N - 2Nn + 2)Gi + Y "^iH + E "^iH) (44) 

n=0 n=Ar-Ar^+l 

and simplified using (flTI) as 

X (^AT^i/^ + (AT - 2Ar;, + 2)Gi + Si) (45) 

where Si ^ ^^^"^ Si(n) + ES-7v,+i Note that [^^]i,i is real-valued. Therefore, [t/]i,i 

is equal to _E'{[^;3]i i[^;3]i i} as given in (l45l) and [C/]q+i,q+i equals zero since the imaginary part 
of [i?z] 1,1 is zero. 

For the rest of the diagonal elements of U, E{[Ilz]a,b[Rz\a,b] equals zero, as Ei is zero. 
Therefore, \U]k^k {2 < k < 2Q and k ^ Q + 1) can be obtained using (|30l) and (|3TI) as 

[C/]fc,, = ii?e(E{[K,],,4^.];j). (46) 

From (|35l) and (l38l) we have 

[C/],,, = V ('2vr|^') 5^5,(n) (47) 

where Skin) is defined as 

s^n) ^ E E E E E ^(^ - ^)'^(^' - 

^hain — r)hb{n — p)ha{u — s)hb{u — m). (48) 
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It is again straightforward to show that for Nh — 1 < n < N — N^, Sk{n) is given by 

9=0 

x{h{t)*h{Nf,-l-i))\g. (49) 

For < n < Nh — I, Sk{n) is given by 

Sk{n)= J2 iiha{i)Wn{i))*K{Nh-l-i))\g 

9=0 

X {{h{l)Wn{l)) * h{Nh - 1 - Z)) \g. (50) 

For N — Nh<n<N — 1, Sk{n) is given by 

N-n+Nh-2 

Sk{n)= J2 iha{i)*haiNh-l-i))\, 

9=0 

x{h{i)*h{Nh-l-i))\g. (51) 



Thus, (l47l) can be rewritten as 



1 . W^^ 



[U]k,k = ^CT^ [271— j {{N-2Nh + 2)Gk + J:k) (52) 

where = E!So^ Sk{n) + T.n=N-N,,+i ^kin). 

All the elements of the matrix U are determined, and thus, the covariance matrix of the correl- 
ogram for undersampled data can be obtained from (l28l) and (|29l ). 

We analyze next the asymptotic behavior of the correlogram for undersampled data. Letting the 
number of samples tend to infinity in (l28l) yields 



lim Cp = lim E{pp } — pp . (53) 



From (|22|) and (|29l ), we obtain 



lim E{pp } 

N^oo 



— (* *)-i* ( lim t/) *)-\ (54) 

27r / VAf-s-oo / 

Recall that all the off-diagonal elements of U are zeros, and the first diagonal element of U is 
given by (|45] ). Letting the number of samples tend to infinity in (|45] ). we obtain 



lim E{[C/]i,i} = aM27r— . (55) 
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The {Q + l)-th element of U is zero, and if the number of samples tend to infinity in (|52l) . 
limAr^oo[f^]A:,fc = 0. Therefore, all the elements of limAr^oo U are equal to zero except for its first 
diagonal element given by (|55l) . 

In order to further simplify (|54|) . only the elements of the first column of ^)^^ are 
required. We have shown in the previous section that these elements are all equal to L/W"^. 
Therefore, (|54l) can be simplified to 

lim E{pp^} = 

N^oo 

fi!:V..f2.a74Vu.-'u. (56) 



where 1^^ is an L x L matrix with all its elements equal to 1. It follows from (l53l) and (l56l) that 



lim Cp = 0. (57) 

7V-!>oo 

In other words, the variance of the correlogram for undersampled data estimator tends to zero as 
the number of samples goes to infinity, which proves the consistency of the estimator. ■ 

IV. Model-Based Nested Least Squares Method 

The spectral estimation method based on multi-coset sampling as studied in the previous sections 
requires the number of samples to be large. Otherwise, the variance of Rz rises leading to poor 
spectral estimation. Moreover, the frequency resolution of the estimation is limited by the parameter 
L. The value of L cannot be increased arbitrarily, as the total length of the signal is limited in 
practice. Besides, this method is limited to WSS signals. 

In our conference contribution lfT3l . we have introduced an improved model-based spectral 
analysis method using nested least squares, which detects sinusoids from noisy compressive mea- 
surements. For this method, the signals do not need to be WSS in general. The algorithm can 
handle short-length signals, and it provides high resolution spectral estimation. Here we explain 
this method in details and provide its analysis as well as derivations of the CRB. 

Let the signal x = [xq, Xi, . . . , xn^^i]'^ E C'^^^^ be a linear combination of K sinusoids 
(K ^ A'^.) where x„ (0 < n < N.^) are the samples of the signal obtained at the Nyquist rate. Here 
the sample Xn is given by 

K 

a;n = J]4e-^"'=" (58) 

k=l 

where dk and tOk (l < k < K) are unknown amplitudes and frequencies of the K sinusoids, 
respectively. By arranging the amplitude parameters in the vector d = [di, . . . , dx]^ € C^^^ 
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and forming the matrix A = [a{ui), a{u2), ■ ■ ■ , a.(wx)] ^ <C^^^ with the frequency parameters, 
the model (|58l) can be rewritten in the matrix-vector form as 

x = Ad (59) 

where a{uj) = [1, e"^'", e'^^'^, e-^^^-^)^]^ G C^^^ is the Vandermonde vector. 

Let the vector of the measurements y E C^^^^ be given by 

y = + w (60) 

where $ G M^^^^ is the measurement matrix, and w G C^^^^ is the measurement noise with 
circularly symmetric complex normal distribution J\fc{0, a"^!). The elements of the measurement 
matrix $ are drawn independently from, for example, the Gaussian distribution A/'(0, 1/M). 

The goal is to estimate the unknown amplitudes and frequencies of the signal (|59l) from the noisy 
compressive measurements (l60l) . 

Two criteria are taken into consideration for developing the estimation algorithm: minimization 
of the estimation error and matching the estimated signal to the sparsity model. The squared norm 
of the compressed estimation error is ||?/ — $a;||2 where x is the estimated signal. Thus, the problem 
of finding the estimate x can be formulated as 

X = arg min \\y — ^x\\l. (61) 

X 

The estimation error is a convex function, and the minimization of (|6TI) can be obtained using the 
least squares technique with the iterative solution 

Xi = Xi_i + A$^(2/ - ^Xi_i) (62) 

where Xi is the estimated signal at the ith iteration and A represents the step size of the LS algorithm 
or equivalently the scaling factor for the residual signal of the previous iteration, that is, y — <^Xi_i. 

The LS problem of (|6TI) is underdetermined and has many solutions. In order to match the 
estimated signal to the model in (|58l) . a pruning step is inserted in the iterative solution of (|62l) . 
Specifically, let x^ = Xj_i + A$^(t/ — $a;j_i), then the frequencies cui, uj2, ■ ■ ■ , cok can be estimated 
from x^ using, for example, the root-MUSIC technique [20] . This method needs the knowledge of 
the autocorrelation matrix of the data Rx for estimating the frequencies. 

Consider windowing x^ by overlapping frames of length W^. Then, the elements of can be 
estimated as 
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where Rx is an estimation for R^, (0 < n < A^^.) are the elements of x'^, and 1 < a, 6 < W^. 
Let il, = [coi,uj2, . . . , cokV ^ [~^' tt]^^^^ be the vector of the estimated frequencies. Then, the 
estimate of the Vandermonde matrix A (denoted by A) can be straightforwardly computed based 
on O. 

Recalling the objective of minimizing the squared norm of the compressed estimation error, the 
vector of the amplitudes d can be estimated by minimizing \\y — ^Ad\\l. The solution for this 
problem is given by 

d, = {B''B)-'B''y (64) 

where B = $A. Note that in [[8l, the amplitudes are estimated as 

d, = A^x^ (65) 

where is the vector of the estimated amplitudes at the i-th iteration. The algorithm based on 
(|65l) is referred to as spectral iterative hard thresholding (SIHT) via root-MUSIC. 

Finally, Xi can be obtained using the estimated frequencies and amplitudes as Xi = Adi. The 
steps of the algorithm are summarized in Algorithm 1. 

The algorithm consists of the outer and the inner least squares steps along with the root-MUSIC 
method. In each iteration, the algorithm converges to the true signal in three steps. First, the outer 
least squares makes an estimation of the subspace in which the original signal lies. This is done by 
minimizing the squared norm of the compressed estimation error. Note that due to the fact that the 
problem is underdetermined, the signal x cannot be estimated, but only an improved estimate of the 
subspace to which the signal x belongs can be found. Then, the signal estimate x is enhanced in 
the second and the third steps of the algorithm. In the second step, the estimation is forced to match 
the signal model by applying the root-MUSIC method. The frequencies are estimated at this stage. 
Note that each frequency represents one of the dimensions of the signal subspace. In the first few 
iterations of the algorithm, some of the frequencies might be estimated incorrectly, as the output 
of the outer least squares step might not be close enough to the true signal subspace. In the third 
step of the algorithm, the amplitudes are estimated by applying the inner least squares. The last 
two steps are building the signal subspace according to the signal model, and then, estimating the 
projection coefficients for each dimension of the subspace. Finally, the estimated signal is fed back 
to the outer least squares step for the next iteration. The algorithm continues until some stopping 
criterion is satisfied. For example, the criterion can be satisfied when a predetermined fixed number 
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Algorithm 1 

Initialize: a;o = 0, i — 1 
repeat 

+ A*^(j/ - *a;i_i) 
S7 ^ root-MUSIC(a;% A') 
A <— [a{uji) a{LLj2) ... a(tliK)] 

Xi <— Adi 
i <~ i + 1 
until stopping criterion is satisfied 



of iterations is performed or the normalized compressed estimation error i\\y — ^x\\l/\\y\\l) is less 
than a given threshold value. 



V. Cramer-Rao Bound for Spectral Compressive Sensing 

The Cramer-Rao bound (CRB) for the problem of estimating the parameters of multiple super- 
imposed exponential signals in noise has been derived in |fT9l . In this section, the CRB for spectral 
compressive sensing is derived by considering the system model (l59l) and (l60l) . 



First, let the vector of parameters be defined hy = [d d Q, Y where d and d represent the 
real and imaginary parts of d, respectively. Furthermore, let 

di 

diag(d) 



D 







d 



K 



(66) 



and G = [g{coi) ■ ■ ■ g{ujK)] where g{co) = da{co)/duj. 
The likelihood function of the measurement vector y is 



^ exp{-^{y-Bd)''iy-Bd)} 



r2\M 



where B = $A. 

The inverse of the Fisher information matrix is given by 

I-\0) = iE{^P^P^})-' 



(67) 



(68) 



where -0 = dhiL/dd. 
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The log-likelihood function is 



InL = -M In 77 - Mlna^ 



y - Bdfiy - Bd) 



a" 



and its derivatives with respect to d and d are 

d\aL 
dd a 

and 

ainL 1 



[-jB^w + i {w^bY^ = —Im {B^w} 



d~d an ^ ' ^ ^ W 

respectively. Recall that w = y — Bd is the measurement noise introduced in (|60l) . 
The derivative of the log-likelihood function with respect to Wfc for 1 < A; < is 

d\nL 



(69) 



(70) 



(71) 



—Re \ d''-—w 



2 



Re{dlg''{oOk)^''w} 



(72) 



The derivatives of the log-likelihood function with respect to the frequencies can be written in the 
matrix form as 



dlnL _ 2 



(73) 



To proceed, we use the extension of Lemma 1 to the vector case. Then, the submatrices of 1(0) 
can be computed as 



E 



dlnL 
dd 



dlnL 
dd 



4 1 



--Re{E{B^ww^B}} 



a* 2 
2 

~2 



a 



Re{B^B] 



E 
E 
E 
E 
E 



'dlnL 




'dlnL' 


T 


dd 




dd 




'dlnL' 




'dlnL' 


T 


dd 




. 9ft _ 




'dlnL' 




'dlnL' 


T 


dd 




dd 




'dlnL' 




'dlnL' 


T 


dd 




. dft 




'dlnL' 




'dlnL' 


T 


dft 




dft 





■—Im{B^B] 



a- 



:Re{B^^GD] 



^,Re{B-B] 



a 



a- 



:Im{B"^GD} 



(74) 
(75) 

(76) 

(77) 

(78) 



—Re \D^G"^^^GD] 

fj2 I. J 



(79) 
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Note that E {ww'^} 



0. Then, I{d) is given by 

m = 



F 
F 



A A 



F A 
F A 
A 



(80) 



where (•) and (•) stand for the real and imaginary parts of a matrix, and 

2 



F 
A 



2 
2 



B^^GD 



fj2 1- J 



(81) 
(82) 
(83) 



The signal x can be considered as a function of 0, and therefore, the covariance matrix of any 
unbiased estimator of x, that is, Cx, satisfies the inequality 



Moreover, the signal x can be written as 

X = Ad + jAd = dia{uji) + . . . + dKO,{ujK)- 



(84) 



(85) 



Then the derivative of x with respect to the whole vector of unknown parameters 6 can be found 
as 

dx 



dO 



[A J A dig{ui) ... dKgiuiK)] 
[A jA GD]. 



Finally, by summing over the diagonal elements of (|84l) . we obtain 



E\ lla;-a;||2 



(86) 



(87) 



VI. Numerical Examples and Simulation Results 

In this section, we investigate the behavior of the correlogram method for finite-length signals 
based on the analytical results obtained in Section Hill We next present the simulation results for 
the model-based nested least squares algorithm for spectral compressive sensing. 
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A. Numerical Examples for the Correlogram for Undersampled Data Method 

The estimation variance of the correlogram method depends on the number of sampling channels 
q, the number of spectral segments L, and the signal length N^. Here, the power of the signal is 
set to 0"^ = 4 and the Nyquist sampling rate is considered to he W = 1000 Hz. The time offsets 
Ci {I < i < q) are found randomly for each (L, g)-pair and kept unchanged for different signal 
lengths. 

Fig. [T] depicts the variance of the estimator [Cp]i i versus the length of the Nyquist samples for 
different values of (L, g)-pairs. The average sampling rate is also given by qW/L. From the curves 
corresponding to the (51, 12), (101, 25), and (201, 50)-pairs, it can be seen that the performance of 
the estimator degrades when increasing the number of spectral segments L, i.e., when increasing 
frequency resolution. The average sampling rate is kept almost the same in this scenario. Consider 
next the case when the frequency resolution L is the same but the average sampling rate is different. 
It can be seen from the curves corresponding to the (101, 20) and (101, 25)-pairs that the estimation 
variance is lower when the average sampling rate is higher. 

B. Simulation Results for the Model-Based Nested Least Squares Algorithm 

In the simulations for the model-based nested least squares algorithm for spectral compressive 
sensing, we consider a signal consisting of = 20 complex-valued sinusoids with a length of 

= 1024 samples. The frequencies (ui, U2, ■ ■ ■ , ojr) are drawn randomly from the [0, 2n) interval 
with the constraint that the pairs of the frequencies are spaced by at least lOir/N radians/sample. 
Furthermore, the amplitudes (si, S2, ■ ■ ■ , sk) are uniformly drawn at random from the [1, 2] interval. 
In all our simulations, the step size of the outer LS algorithm is set to 1 (A = 1). 

The normalized mean squared error (NMSE) is defined as 

NMSE = 10 log (^^^f^) . (88) 
Recalling dST]), the normalized CRB (NCRB) is defined as 

The first experiment explores the performance of the model-based nested LS and the SIHT via 
root-MUSIC algorithms [8] over 10 iterations. The number of measurements is set to 300 (M = 300) 
and the noise standard deviation to 2 (cr = 2). The simulation results are illustrated in Fig. [21 It 
can be seen that the model-based nested LS algorithm converges after 5 iterations, while the SIHT 



February 14, 2012 



DRAFT 



ffiEE TRANSACTIONS ON SIGNAL PROCESSING 



21 



via root-MUSIC method requires more iterations to converge. At the 5th iteration, the proposed 
algorithm performs 3 dB better than the SIHT method, and it is 1 dB away from the NCRB. After 
10 iterations, the algorithm still performs 1 dB better than the SIHT method. 

Next, the performance of the algorithm is investigated for a range of noise variances. The result 
of the second experiment is depicted in Fig. [3l The number of measurements is set to 300 (M = 
300) and the number of iterations of the algorithm to 10. Similar to the previous example, the 
proposed method outperforms the SIHT via root-MUSIC algorithm. Moreover, it can be seen that 
the performance of the proposed algorithm approaches the bound at high signal to noise ratio values. 

The third experiment investigates the performance of the algorithms for different numbers of 
measurements. The noise standard deviation is set to 2 (cr = 2), and the number of iterations is set 
to 10. The results are shown in Fig. |4l It can be seen that with 200 measurements, the proposed 
algorithm is able to recover the signal, while the performance of the SIHT method is significantly 
far from the NCRB. For larger number of measurements, the model-based nested LS algorithm 
performs about 1 dB better than the SIHT via root-MUSIC method, and it is about 1 dB away 
from the NCRB. 

Finally, we investigate the convergence of the proposed algorithm by counting the number of 
missed signal frequencies over the iterations of the algorithm. A frequency of the true signal is 
considered as missing in the estimated signal when the root-MUSIC algorithm does not output any 
frequency within a distance of less than 5n/N radians/sample (which is the resolution limit under 
the simulation set-up) to the true frequency. The number of the missed signal frequencies in the 
root-MUSIC algorithm can be a measure of the subspace swap phenomenon, when a number of 
the vectors between the estimated signal and noise subspaces are switched [|2T]|. [|22l|. 

The average number of missed signal frequencies over 4 iterations is presented in Table HI The 
number of the measurements is set to 300 (M = 300). It can be seen that after 3 iterations, the 
root-MUSIC algorithm is able to find all the signal frequencies (for a = 2, 3, and 4). This indicates 
that the outer LS step of the algorithm is converging to the true signal subspace, as the root-MUSIC 
algorithm is able to distinguish more accurately between the signal and noise subspaces. 

VII. Conclusion and Discussion 

We have considered two spectrum estimation techniques for undersampled data: the correlogram 
method which estimates the spectrum from a subset of the Nyquist samples and the model-based 
nested least squares algorithm which works with compressive measurements. 
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Iteration number 


1 


2 


3 


4 


a = 2 


2.89 


0.05 








<T = 3 


3.11 


0.12 








CT = 4 


3.24 


0.23 


0.01 






The correlogram estimation method for low-resolution spectral estimation has been analyzed in 
this paper by computing the bias and the variance of the estimator. It has been shown that the 
estimator is unbiased for any signal length, and it has been proven that the variance of the method 
tends to zero asymptotically. Therefore, this method is a consistent estimator. The behavior of the 
estimator for finite-length signals has been also investigated, and it has been illustrated that there 
is a tradeoff between the accuracy of the estimator and the frequency resolution. It has been shown 
that at a fixed average sampling rate, the performance of the estimator degrades for the estimation 
with higher frequency resolution. Furthermore, for a given frequency resolution, the performance 
improves by increasing the average sampling rate. 

In the second part of the paper, we introduced a new signal recovery algorithm for model-based 
spectral compressive sensing for high-resolution spectral estimation. We considered a general signal 
model consisting of complex-valued sinusoids with unknown frequencies and amplitudes. Although 
the signal model is inherently sparse, its representation in the Fourier basis does not offer much 
sparsity. For this reason, the conventional CS recovery algorithms do not perform well for such 
signals. 

The proposed algorithm estimates the signal iteratively by performing three steps at each iteration. 
First, the outer least squares makes an estimation of the subspace in which the original signal 
lies. This is done by minimizing the squared norm of the compressed estimation error. Next, the 
unknown frequencies are estimated using the root-MUSIC algorithm. Then, the amplitudes of the 
signal elements are estimated by the inner least squares, and the result is fed back to the outer least 
squares for the next iteration. 

The Cramer-Rao bound for the given signal model has been also derived. Finally, the simulation 
results have been presented, and it has been shown that the proposed algorithm is able to converge 
after 5 iterations for the given settings and it approaches the CRB at high signal to noise ratio 
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-^(Uq)=(51,12) qW/L=235Hz 

(L,q)=(1 01 ,25) qW/L=247 Hz 

-B-(L,q)=(101,20) qW/L=198Hz 

-e- (L,q)=(201 ,50) qW/L=248 Hz 




Fig. 1. Variance versus Nyquist signal length. 
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Fig. 2. Normalized mean squared error versus iteration number for cr = 2 and M = 300. 
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Noise Standard deviation (a) 



Fig. 3. Normalized mean squared error versus noise standard deviation after 10 iterations for M = 300. 
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Fig. 4. Normalized mean squared error versus number of measurements after 10 iterations for a = 2. 
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